feat(kernelgen): import NKIPyKernelGen as a subfolder by shaojiex-aws · Pull Request #55 · aws-neuron/nkipy

shaojiex-aws · 2026-04-27T18:36:12Z

Import the open_source branch of NKIPyKernelGen into kernelgen/ as a self-contained subpackage. NKIPyKernelGen is a compiler that traces NumPy functions and lowers them to NISA (Neuron Instruction Set Architecture) for AWS Neuron hardware. Users write kernels in Python with @trace and knob.knob() annotations; the compiler handles tiling, memory placement, layout legalization, and NISA lowering.
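
As an illustration of what "tracing" means here, the following is a minimal toy sketch and not NKIPyKernelGen's actual implementation. It assumes tracing works by running the Python function on placeholder objects whose operators record operations instead of computing values. The names `ToyGraph`, `ToyTracedArray`, and `toy_trace` are hypothetical and do not come from the package; the real `@trace` decorator and `TracedArray` class wrap MLIR SSA values and may have different signatures.

```python
import functools
from dataclasses import dataclass, field


@dataclass
class ToyGraph:
    """Hypothetical op recorder; stands in for an MLIR module."""
    ops: list = field(default_factory=list)

    def emit(self, name, *operands):
        # Give each result an SSA-like name (%0, %1, ...) and record the op.
        result = f"%{len(self.ops)}"
        self.ops.append((result, name, operands))
        return result


class ToyTracedArray:
    """Hypothetical traced value: carries an SSA-like name and no data."""

    def __init__(self, graph, ssa):
        self.graph = graph
        self.ssa = ssa

    def _binary(self, name, other):
        new_ssa = self.graph.emit(name, self.ssa, other.ssa)
        return ToyTracedArray(self.graph, new_ssa)

    def __add__(self, other):
        return self._binary("add", other)

    def __mul__(self, other):
        return self._binary("mul", other)


def toy_trace(fn):
    """Hypothetical decorator: call fn on placeholders, return the recorded graph."""

    @functools.wraps(fn)
    def wrapper(num_args):
        graph = ToyGraph()
        args = [ToyTracedArray(graph, f"%arg{i}") for i in range(num_args)]
        fn(*args)
        return graph

    return wrapper


@toy_trace
def fused_multiply_add(a, b, c):
    return a * b + c


graph = fused_multiply_add(3)
for result, name, operands in graph.ops:
    print(result, "=", name, *operands)
```

The function body is ordinary Python; only the argument types change, which is why the same source can be lowered by a compiler rather than executed eagerly.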

What's included

- kernelgen/nkipy_kernelgen/ (Python tracing frontend):
  - trace.py (@trace decorator)
  - knob.py (tensor annotations: mem_space, tile_size, reduction_tile, partition_dim)
  - traced_array.py (TracedArray wrapping MLIR SSA values)
  - op_vtable.py (NumPy op → MLIR lowering table)
  - transforms/nkipy_opt.py (pipeline orchestration, shells out to nkipy-opt)
- kernelgen/mlir/ (MLIR dialect + C++ passes):
  - nkipy.annotate op (target, mem_space, partition_dim, tile_size, reduction_tile)
  - 20+ transformation passes under mlir/lib/Transforms/ implementing the 24-pass compilation pipeline (InferLayout, KnobDrivenTiling, AnnotateMemorySpace, LegalizeLayout, InsertSpillReload, LinalgToNisa, etc.)
- kernelgen/tests/ (test suite):
  - passes/: per-pass FileCheck tests
  - e2e/: end-to-end tests (trace → NISA → BIR sim / HW)
  - unit/: Python-level unit tests
  - harness.py: unified test harness with LLVM/BIR_SIM/HW/FileCheck modes
- kernelgen/examples/: example kernels
- kernelgen/compiler_explorer/: Compiler Explorer wrapper for inspecting IR at any pipeline stage
- kernelgen/setup.py, pyproject.toml, pytest.ini, requirements.txt: build + test configuration (pip install -e kernelgen/ builds the C++ passes via CMake)
- kernelgen/CLAUDE.md, README.md: pipeline docs and usage notes
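
To make the knob fields listed above more concrete, here is a rough sketch of how a per-tensor `tile_size` annotation could determine a tile grid. This is an assumption-laden illustration, not the KnobDrivenTiling pass: `ToyKnob` and `tile_grid` are hypothetical names, the field types are guesses, and only the field names (`mem_space`, `tile_size`, `reduction_tile`, `partition_dim`) come from the description.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ToyKnob:
    """Hypothetical record mirroring the knob field names; types are assumed."""
    tile_size: Tuple[int, ...]
    partition_dim: int = 0
    mem_space: Optional[str] = None  # real enum values are not shown in this PR
    reduction_tile: Optional[Tuple[int, ...]] = None


def tile_grid(shape: Tuple[int, ...], knob: ToyKnob) -> Tuple[int, ...]:
    """Return how many tiles cover each dimension (ceiling division)."""
    if len(shape) != len(knob.tile_size):
        raise ValueError("tile_size rank must match tensor rank")
    # -(-d // t) is integer ceiling division, so a partial tile still counts.
    return tuple(-(-d // t) for d, t in zip(shape, knob.tile_size))


knob = ToyKnob(tile_size=(128, 512))
print(tile_grid((256, 1024), knob))
print(tile_grid((300, 1024), knob))
```

The second call shows the ragged case: a dimension that is not a multiple of its tile size needs one extra, partially filled tile.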

Architecture notes

NKIPyKernelGen depends on the NISA dialect defined in private-nki-staging (the nki wheel). NKIPyKernelGen's nkipy-opt binary performs the tensor-level and bufferization phases; lowering to BIR then runs through the upstream nki-opt-pipeline. This import does not bring in the NISA dialect sources — only NKIPyKernelGen's own passes and frontend.

Ignore rules

Added a !mlir/lib/ override in kernelgen/.gitignore so the parent nkipy repo's lib/ rule (intended for Python venv lib/ dirs) does not silently exclude the MLIR C++ pass sources under kernelgen/mlir/lib/.

Source

Imported from NKIPyKernelGen open_source branch @ commit 973c1be ("fix: correct mem_space enum values in builder.annotate()"). Internal git history is not preserved — this is a single squash import for the open-source release.

Issue #, if available:

Description of changes:

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.


aws-zhehongb approved these changes May 1, 2026

shaojiex-aws wants to merge 1 commit into aws-neuron:feat/kernelgen from shaojiex-aws:feat/kernelgen
